

# Machine Language Part 1

Dr. Wooi Ping Cheah

### A self-introduction

- Dr. Wooi Ping Cheah
  - —Teaching Fellow, UNNC
  - -Office: PMB323
  - -Office Hour: Wednesday, 12pm-2pm

### Research interests

- Soft Computing
- Knowledge Engineering
- Software Engineering
- Artificial Intelligence
- Programming Paradigms

### Outlines

- Introduction to machine language
- Some basic operations
- Hack basics
- Hack assembly programming

# Why learning machine language?



### Computers are flexible

Many software programs can run on the same hardware.



# Universality

Many software programs can run on the same hardware.

Theory



Alan Turing:
Universal Turing Machine

**Practice** 



John Von Nuemann:
Stored Program Computer

# Universal Turing machine

- A machine that can simulate an arbitrary machine operation on arbitrary input. (wikipedia)
  - ➤ Reading both the description of the machine to be simulated and the data input to the machine from its own tape.



# The first computer



- Designed by Charles Babbage, in 1822.
- Powered by a steam engine.
- Use Punched Cards.

# Stored program concept

#### **Computer System**



### ENIAC - first general-purpose computer



- Electronic Numerical Integrator And Computer (ENIAC).
- First Turing-complete computer.
- Announced in 1946.
- By Moore School of Electrical Engineering, University of Pennsylvania, US.

### An informal definition: machine language

 A machine language can be viewed as an agreedupon formalism, designed to manipulate a memory using a processor and a set of registers. (Nisan &

Schocken)



### List of machine languages

- ARM: 16-bit, 32-bit, 64-bit
- DEC: 12-bit, 16-bit, 18-bit, 32-bit, 36-bit, 64-bit
- Intel: 8008, 8080, 8085, Zilog Z80.
- X86: 16-bit x86, IA-32, x86-64
- IBM: 305, 650, 701, ...
- MIPS
- Motorola 6800, 68000 family
- Hack assembly

• ...

Machine language is hardware-dependent.

# Machine language at a glance

- Processor (CPU)
  - >ALU, memory access, control (branching).
  - $\triangleright$ E.g. add R1, R2, R3 // R1 ← R2 + R3.
- Memory
  - Collection of hardware devices that store data and instructions in a computer.
  - ➤ E.g. load R1, 67 // R1←Memory[67].
  - ➤ Slow access.
- Register
  - ➤ High-speed local memory.

### Machine language

**Computer System** 



#### **Handling instructions:**

- 1011 means "addition" operation
- 0001010100 means "operate on memory address 340"
- Next we have to execute the instruction at address 2-

addressing

control

15

# Compilation



Virtual machine and assembly language in between. We will come back to them later.

# Machine language

#### Binary instruction:



• Difficult to understand.

#### Assembly language:



- Symbolic machine language instructions.
- Much easier to understand,
- Use assembler to translate assembly language to binary instructions.

### Assembler

**Assembly Language** 

@i
 M=1 // i = 1
 @sum
 M=0 // sum = 0
(LOOP)
 // if i>RAM[0],
 // GOTO WRITE
 @i
 D=M
 @R0
 D=D-M
 @WRITE
 D;JGT
 ... // Etc.

Machine Language





assembler

### Recap

#### high-level program



### Outlines

- Introduction to machine language
- Some basic operations
  - ➤ Arithmetic and logic operations
  - ➤ Memory access
  - > Flow control
- Hack basics
- Hack assembly programming

# Arithmetic operations

Addition/substraction

```
    ➤ ADD R1, R2, R3  // R1 ← R2 + R3, where R1, R2, // R3 are registers.
    ➤ ADD R1, R2, index  // R1 ← R2 + index, where index // stands for the value of the // memory pointed at by the // user-defined label index. // e.g. index is RAM[129].
```

### Logic operations

- Basic boolean operations:
  - $\triangleright$  Bitwise negation  $//0 \rightarrow 1$ , or  $1 \rightarrow 0$ .
  - ➤ Bit shifting //00010111 left-shift by 2: 01011100
  - ➤ Bitwise And, Or, etc.
- Example:
  - ➤ AND R1, R1, R2 //R1 ← bitwise And of R1 and R2.

### Memory access

#### **Computer System**



Which data the instruction should operate? Check memory access address.

# Memory hierarchy

- Accessing a memory location is expensive:
  - Need to supply a long address
  - Memory to CPU: take time
- Solution: memory hierarchy:



# Memory hierarchy

| Type            | Description                                                                      | Typical storage                      | Typical speed         |
|-----------------|----------------------------------------------------------------------------------|--------------------------------------|-----------------------|
| CPU<br>Register | Quickly accessible memory location available to a CPU.                           | 48 128-Byte registers, 6 kB          | ≤1 CPU cycle          |
| CPU<br>Cache    | A hardware cache used by CPU to reduce the cost to access data from main memory. | Intel i7 (2008),<br>8 MB L3<br>cache | 3~14<br>CPU<br>cycles |
| Main<br>memory  | Random-access memory (RAM).                                                      | 4~8 GB                               | 240 CPU cycles        |
| Disk<br>memory  | Harddisk.                                                                        | 500 GB, 1 TB                         | 10~30<br>ms           |

E.g. for a 2.5GHz CPU, 1 CPU cycle  $\approx$  0.4 ns.

- The CPU typically contains a few, easily accessed, registers.
- They are the **central part** of the machine language.

#### Data registers:

add R1, R2 // R2 
$$\leftarrow$$
 R1 + R2

|            | R1 | R2 |
|------------|----|----|
| Before add | 10 | 25 |
| After add  |    |    |



- The CPU typically contains a few, easily accessed, registers.
- They are the **central part** of the machine language.

#### Data registers:

add R1, R2 // R2 
$$\leftarrow$$
 R1 + R2

|            | R1 | R2 |
|------------|----|----|
| Before add | 10 | 25 |
| After add  | 10 | 35 |



- The CPU typically contains a few, easily accessed, registers.
- They are the central part of the machine language.

#### Data registers:

add R1, R2 // R2  $\leftarrow$  R1 + R2

#### Address registers:

store R1,  $@A // @A \leftarrow R1$ 



- The CPU typically contains a few, easily accessed, registers.
- They are the central part of the machine language.

#### Data registers:

add R1, R2 // R2  $\leftarrow$  R1 + R2

#### Address registers:

store R1,  $@A // @A \leftarrow R1$ 



### Addressing modes

Register

```
\triangleright ADD R1, R2 // R2 \leftarrow R2 + R1
```

>Access data from a register R2.

Direct

```
► ADD R1, M[67] // Mem[67] ← Mem[67] + R1
```

➤ LOAD R1, 67 // R1 ← Mem[67]

Access data from **fixed memory address** 67.

Indirect

```
➤ ADD R1, @A // Mem[A] ← Mem[A] + R1
```

Access data from memory address specified by variable A.

• Immediate

```
\trianglerightADD 67, R1  // R1 \leftarrow R1 + 67
```

➤ LOADI R1, 67 // R1  $\leftarrow$  67

Access the data of value 67 immediately.

# Input / Output

- Many types of input and output devices:
  - ➤ Keyboard, mouse, camera, sensors, printers, screen, sound...
- The CPU needs some agreed-upon protocol to talk to each of them
  - Software **drivers** realize these protocols.
- One general method of interaction uses memory mapping:
  - $\triangleright$  Memory location  $A_1$  holds the direction of the last movement of the mouse.
  - $\triangleright$  Memory location  $A_2$  tells the printer to print single-side or double side.

### Flow control

#### **Computer System**



Which instruction to process next?

### Flow control

- Usually CPU executes instructions in sequence.
- Sometimes "jump" unconditionally to another location, e.g. implement a loop.

#### Example:

```
101: load R1,0
102: add 1, R1
103: ...
... // do something with R1 value
...
156: jmp 102 // goto 102
```

#### Symbolic version:

```
load R1,0
LOOP:
add 1, R1
...
// do something with R1 value
...
jmp LOOP // goto loop
```

### Flow control

- Usually CPU executes instructions in sequence.
- Sometimes "jump" unconditionally to another location, e.g. implement a loop
- Sometimes jump only if some condition is met:

#### Example:

```
jgt R1, 0, CONT  // if R1>0 jump to CONT
sub R1, 0, R1  // R1 ← (0 - R1)
CONT:
...
// Do something with positive R1
```

### Recap

- Arithmetic and logic operations
  - >Addition/substraction,
  - ➤ Bitwise operations.
- Memory access
  - ➤ Memory hierachy,
  - ➤ Data register/address register,
  - Four addressing modes,
  - ➤ Input/output memory mapping.
- Flow control
  - ➤ Run in sequence,
  - > Jump conditionally/unconditionally.

### Outlines

- Machine language
- Some basic operations
- Hack basics
  - ➤ Hack computer
  - ➤ Hack machine language
  - ➤ Hack input / output
- Hack assembly programming

## Hack computer: hardware



### A **16-bit** machine consisting of:

- Data memory (RAM): a sequence of 16-bit registers:
   RAM[0], RAM[1], RAM[2],...
- Instruction memory (ROM): a sequence of **16-bit** registers: ROM[0], ROM[1], ROM[2],...
- Central Processing Unit (CPU): performs **16-bit** instructions
- Instruction bus / data bus / address buses.

### **CPU Emulator**



## Hack computer: software



- Hack machine language:
  - > 16-bit A-instructions
  - > 16-bit C-instructions

Hack program = sequence of instructions written in the Hack machine language.

# Hack computer: start



- The ROM is loaded with a Hack program.
- When the *reset* button is pushed, the program starts running.

# Hack computer: registers



- Three 16-bit registers:
  - > D: Store data
  - > A: Store address / data of the memory
  - M: Represent currently addressed memory register: M = RAM[A]

### Instructions

- Every operation involving a memory location requires two Hack commands:
  - >A-instruction: address instruction
    - ■Set the address to operate on.

```
E.g., @17 // A \leftarrow 17.
```

- >C-instruction: command instruction
  - Specify desired operation.

```
E.g., @17 // 17 refers to memory location 17, A \leftarrow 17. 
M=1 // RAM[17] = 1. C-instruction.
```

### Outlines

- Introduction to machine language
- Some basic operations
- Hack basics
  - ➤ Hack computer
  - ➤ Hack machine language
  - ➤ Hack input / output
- Hack assembly programming

# Hack machine language



### Two ways to express the same semantics:

language



language

44

# A-instruction specification

<u>Semantics:</u> Set the A register to *value* (memory address)

### Symbolic syntax:

@ value

Example:

@21

set A to 21

#### Where *value* is either:

- > a non-negative decimal constant  $\leq$  32767 (=2<sup>15</sup>-1) or
- a symbol referring to a constant (come back to this later)

### **Binary syntax:**

0 value

Where *value* is a 15-bit binary constant

### Example:

000000000010101

set A to 21

opcode signifying an A-instruction

# C-instruction specification

```
      Syntax:
      dest = comp; jump
      (both dest and jump are optional)

      where:
      0, 1, -1, D, A, !D, !A, -D, -A, D+1, A+1, D-1, A-1, D+A, D-A, A-D, D&A, D|A

      comp =
      M, !M, -M, M+1, M-1, D+M, D-M, M-D, D&M, D|M

      dest =
      null, M, D, MD, A, AM, AD, AMD
      (M refer to RAM[A])

      jump =
      null, JGT, JEQ, JGE, JLT, JNE, JLE, JMP
```

#### Semantics:

- Computes the value of comp
- Stores the result in dest
- If the Boolean expression (comp == 0) is true, jumps to execute the instruction at ROM[A].

# C-instruction specification

Symbolic syntax:

dest = comp ; jump

Binary syntax:

opcode

1 1 1 a c1 c2 c3 c4 c5 c6 d1 d2 d3 j1 j2 j3

opcode

not used comp bits

dest bits jump bits

| COI  | пр   | <b>c1</b> | c2 | с3 | с4 | <b>c</b> 5 | <b>c</b> 6 |
|------|------|-----------|----|----|----|------------|------------|
| 0    |      | 1         | 0  | 1  | 0  | 1          | 0          |
| 1    |      | 1         | 1  | 1  | 1  | 1          | 1          |
| -1   |      | 1         | 1  | 1  | 0  | 1          | 0          |
| D    |      | 0         | 0  | 1  | 1  | 0          | 0          |
| Α    | М    | 1         | 1  | 0  | 0  | 0          | 0          |
| !D   |      | 0         | 0  | 1  | 1  | 0          | 1          |
| !A   | ! M  | 1         | 1  | 0  | 0  | 0          | 1          |
| -D   |      | 0         | 0  | 1  | 1  | 1          | 1          |
| -A   | -M   | 1         | 1  | 0  | 0  | 1          | 1          |
| D+1  |      | 0         | 1  | 1  | 1  | 1          | 1          |
| A+1  | M+1  | 1         | 1  | 0  | 1  | 1          | 1          |
| D-1  |      | 0         | 0  | 1  | 1  | 1          | 0          |
| A-1  | M-1  | 1         | 1  | 0  | 0  | 1          | 0          |
| D+A  | D+M  | 0         | 0  | 0  | 0  | 1          | 0          |
| D-A  | D-M  | 0         | 1  | 0  | 0  | 1          | 1          |
| A-D  | M-D  | 0         | 0  | 0  | 1  | 1          | 1          |
| D&A  | D&M  | 0         | 0  | 0  | 0  | 0          | 0          |
| DA   | D M  | 0         | 1  | 0  | 1  | 0          | 1          |
| a==0 | a==1 |           |    |    |    |            |            |

|   | dest | d1 | d2 | d3 | effect: the value is stored in:    |
|---|------|----|----|----|------------------------------------|
| Ī | null | 0  | 0  | 0  | The value is not stored            |
|   | М    | 0  | 0  | 1  | RAM[A]                             |
|   | D    | 0  | 1  | 0  | D register                         |
|   | MD   | 0  | 1  | 1  | RAM[A] and D register              |
|   | Α    | 1  | 0  | 0  | A register                         |
|   | AM   | 1  | 0  | 1  | A register and RAM[A]              |
|   | AD   | 1  | 1  | 0  | A register and D register          |
|   | AMD  | 1  | 1  | 1  | A register, RAM[A], and D register |
| _ |      |    |    |    |                                    |

| jump | j1 | j2 | j3 | effect:            |
|------|----|----|----|--------------------|
| null | 0  | 0  | 0  | no jump            |
| JGT  | 0  | 0  | 1  | if out > 0 jump    |
| JEQ  | 0  | 1  | 0  | if out = 0 jump    |
| JGE  | 0  | 1  | 1  | if out ≥ 0 jump    |
| JLT  | 1  | 0  | 0  | if out < 0 jump    |
| JNE  | 1  | 0  | 1  | if out ≠ 0 jump    |
| JLE  | 1  | 1  | 0  | if out ≤ 0 jump    |
| JMP  | 1  | 1  | 1  | Unconditional jump |

# C-instruction: symbolic examples

```
// Set the D register to -1 D = -1 // only constants like 0 ,1, -1 can be directly assigned to D.
```

```
// Sets RAM[300] to the value of the D register plus 1 @300 // A = 300, M refer to RAM[300] M=D+1 // RAM[300] = D + 1
```

# C-instruction: symbolic examples

### Exercise: C-instruction

```
// Set RAM[0] = 16.
```

### Exercise: C-instruction - answer

```
// Set RAM[0] = 16.

@16  // Set A = 16.

D=A  // Set D = 16.

@0  // Set A = 0.

M=D  // Set RAM[0] = 16.
```

### Quiz: C-instruction

```
// Set RAM[0] = 16, RAM[1] = 32, then swap RAM[0] and RAM[1],
// using RAM[2] as temporary variable.
```

### Quiz: C-instruction - answer

```
// Set RAM[0] = 16, RAM[1] = 32, then swap RAM[0] and RAM[1],
// using RAM[2] as temporary variable.
```

```
//RAM[1]=RAM[2]
//RAM[0] = 16;
                     //swap, RAM[2]=RAM[0]
@16
                     @0
                                             @2
                     D=M
D=A
                                             D=M
                     @2
00
                                             @1
M=D
                     M=D
                                             M=D
//RAM[1] = 32;
                     //RAM[0] = RAM[1]
@32
                     @1
D=A
                     D=M
@1
                     @0
                     M=D
M=D
```

# C-instruction: symbolic to binary

dest = comp ; jump

Symbolic:

Binary:

MD=D+1

1110011111011000

M=1

1110111111001000

D+1;JLE

1110011111000110

|      |      |    | _  | _          |            |            | _  |
|------|------|----|----|------------|------------|------------|----|
| COI  | np   | c1 | c2 | <b>c</b> 3 | <b>c</b> 4 | <b>c</b> 5 | С6 |
| 0    |      | 1  | 0  | 1          | 0          | 1          | 0  |
| 1    |      | 1  | 1  | 1          | 1          | 1          | 1  |
| -1   |      | 1  | 1  | 1          | 0          | 1          | 0  |
| D    |      | 0  | 0  | 1          | 1          | 0          | 0  |
| Α    | М    | 1  | 1  | 0          | 0          | 0          | 0  |
| !D   |      | 0  | 0  | 1          | 1          | 0          | 1  |
| !A   | ! M  | 1  | 1  | 0          | 0          | 0          | 1  |
| -D   |      | 0  | 0  | 1          | 1          | 1          | 1  |
| -A   | -M   | 1  | 1  | 0          | 0          | 1          | 1  |
| D+1  |      | 0  | 1  | 1          | 1          | 1          | 1  |
| A+1  | M+1  | 1  | 1  | 0          | 1          | 1          | 1  |
| D-1  |      | 0  | 0  | 1          | 1          | 1          | 0  |
| A-1  | M-1  | 1  | 1  | 0          | 0          | 1          | 0  |
| D+A  | D+M  | 0  | 0  | 0          | 0          | 1          | 0  |
| D-A  | D-M  | 0  | 1  | 0          | 0          | 1          | 1  |
| A-D  | M-D  | 0  | 0  | 0          | 1          | 1          | 1  |
| D&A  | D&M  | 0  | 0  | 0          | 0          | 0          | 0  |
| D A  | D M  | 0  | 1  | 0          | 1          | 0          | 1  |
| a==0 | a==1 |    |    |            |            |            |    |

| dest | d1 | d2 | d3 | effect: the value is stored in:    |
|------|----|----|----|------------------------------------|
| null | 0  | 0  | 0  | The value is not stored            |
| М    | 0  | 0  | 1  | RAM[A]                             |
| D    | 0  | 1  | 0  | D register                         |
| MD   | 0  | 1  | 1  | RAM[A] and D register              |
| Α    | 1  | 0  | 0  | A register                         |
| AM   | 1  | 0  | 1  | A register and RAM[A]              |
| AD   | 1  | 1  | 0  | A register and D register          |
| AMD  | 1  | 1  | 1  | A register, RAM[A], and D register |

| jump | j1 | j2 | j3 | effect:            |
|------|----|----|----|--------------------|
| null | 0  | 0  | 0  | no jump            |
| JGT  | 0  | 0  | 1  | if out > 0 jump    |
| JEQ  | 0  | 1  | 0  | if out = 0 jump    |
| JGE  | 0  | 1  | 1  | if out ≥ 0 jump    |
| JLT  | 1  | 0  | 0  | if out < 0 jump    |
| JNE  | 1  | 0  | 1  | if out ≠ 0 jump    |
| JLE  | 1  | 1  | 0  | if out ≤ 0 jump    |
| JMР  | 1  | 1  | 1  | Unconditional jump |

# Hack program at a glance

### Symbolic code

```
// Computes RAM[1] = 1+...+RAM[0]
    // Usage: put a number in RAM[0]
         // RAM[16] represents i
         // i = 1
    M=1
         // RAM[17] represents sum
    @17
         // sum = 0
    M=0
4
    @16
    D=M
    @0
    D=D-M
    @18
           // if i>RAM[0] goto 18
    D; JGT
10
    @16
11
    D=M
    @17
13
    M=D+M
           // sum += i
14
    @16
    M=M+1
           // i++
            // goto 4 (loop)
16
    @4
17
    0;JMP
18
    @17
19
    D=M
20
    @1
    M=D
           // RAM[1] = sum
           // program's end
    @22
23
           // infinite loop
    0;JMP
```

### **Observations:**

- Hack program:
   a sequence of Hack instructions
- White space is permitted
- Comments are welcome
- There are better ways to write symbolic Hack programs.

No need to understand for now ... We will come back to this shortly.

# Hack programs: symbolic and binary

translate

assembler

Symbolic code

```
// Computes RAM[1] = 1+...+RAM[0]
    // Usage: put a number in RAM[0]
    @16 // RAM[16] represents i
0
    M=1 // i = 1
    @17
         // RAM[17] represents sum
    M=0
         // sum = 0
4
    @16
    D=M
    @0
    D=D-M
    @18
           // if i>RAM[0] goto 18
    D; JGT
10
    @16
11
    D=M
12
    @17
13
    M=D+M // sum += i
14
    @16
15
    M=M+1 // i++
16
           // goto 4 (loop)
    @4
17
    0:JMP
18
    @17
19
    D=M
20
    @1
21
    M=D
           // RAM[1] = sum
    @22
           // program's end
23
           // infinite loop
    0;JMP
```

Binary code

execute

# Acknowlegement

- This set of lecture notes are based on the lecture notes provided by Noam Nisam / Shimon Schocken.
- You may find more information on: www.nand2tetris.org.